Sparse Matrix Block-Cyclic Redistribution
Authors
Abstract
Run-time support for CYCLIC(k) redistribution under the SPMD computation model is of considerable current interest to the scientific community. This work characterizes sparse matrix redistribution and the problems that arise from the use of compressed representations. Two main improvements, in buffering and in coordinate calculation, modify the original algorithm. Our solutions consist of a Collecting stage, a Communication stage, and a Mixing stage, whose influence on execution time depends on the sparsity of the matrix and the number of processors. Experiments were carried out on a Cray T3E with real matrices and different redistribution parameters.
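To make the three stages concrete, the following is a minimal sketch of a CYCLIC(k) to CYCLIC(k') redistribution of a sparse one-dimensional array stored as compressed (local index, value) pairs. It is not the algorithm from the paper: the stage names come from the abstract, but the function names (`owner_and_local`, `to_global`, `distribute`, `redistribute`), the pair-list storage format, and the in-memory simulation of the processors are all assumptions made for illustration.

```python
def owner_and_local(i, k, p):
    """Map global index i to (owner, local index) under CYCLIC(k) on p processors."""
    block = i // k                      # which block of size k holds index i
    return block % p, (block // p) * k + i % k


def to_global(proc, local, k, p):
    """Inverse mapping: recover the global index from (processor, local index)."""
    return ((local // k) * p + proc) * k + local % k


def distribute(entries, k, p):
    """Split sorted (global index, value) pairs into per-processor compressed lists."""
    parts = [[] for _ in range(p)]
    for i, v in entries:
        proc, loc = owner_and_local(i, k, p)
        parts[proc].append((loc, v))
    return parts


def redistribute(parts, k_src, k_dst, p):
    """Redistribute compressed data from CYCLIC(k_src) to CYCLIC(k_dst) (simulated)."""
    # Collecting (assumed meaning): each source fills one buffer per destination,
    # translating coordinates from the old distribution to the new one.
    send = [[[] for _ in range(p)] for _ in range(p)]
    for src, local_entries in enumerate(parts):
        for loc, v in local_entries:
            g = to_global(src, loc, k_src, p)
            dst, new_loc = owner_and_local(g, k_dst, p)
            send[src][dst].append((new_loc, v))
    # Communication: an all-to-all exchange, simulated here by list indexing.
    recv = [[send[src][dst] for src in range(p)] for dst in range(p)]
    # Mixing (assumed meaning): each destination merges the buffers it received
    # into a single compressed list ordered by local index.
    return [sorted(e for buf in bufs for e in buf) for bufs in recv]


entries = [(0, 1.0), (3, 2.0), (7, 3.0), (10, 4.0), (13, 5.0)]
old_parts = distribute(entries, 2, 2)           # CYCLIC(2) on 2 processors
new_parts = redistribute(old_parts, 2, 3, 2)    # move to CYCLIC(3)
```

In this sketch only nonzero entries are stored and moved, so the cost of each stage grows with the number of nonzeros rather than with the full array length, which is one way to read the abstract's remark that the stages weigh differently depending on sparsity and processor count.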
Similar resources
GAMS Index for the NAG Parallel Library
C Elementary and special functions (search also class L5 ) C1 Integer-valued functions (e.g., factorial, binomial coefficient, permutations, combinations, floor, ceiling) C06GXFP Factorizes a positive integer n as n = n1 × n2. This routine may be used in conjunction with C06MCFP D Linear Algebra D1 Elementary vector and matrix operations D1a Elementary vector operations D1a1 Set to constant D1a...
Sparse Block and Cyclic Data Distributions for Matrix Computations
A significant part of scientific codes consists of sparse matrix computations. In this work we propose two new pseudoregular data distributions for sparse matrices. The Multiple Recursive Decomposition (MRD) partitions the data using the prime factors of the dimensions of a multiprocessor network with mesh topology. Furthermore, we introduce a new storage scheme, storage-by-row-of-blocks, that s...
Algorithmic Redistribution Methods for Block-Cyclic Decompositions
This research aims to create and provide a framework for describing algorithmic redistribution methods for various block-cyclic decompositions. To do so, properties of this data distribution scheme are formally exhibited. The examination of a number of basic dense linear algebra operations illustrates the application of those properties. This study analyzes the extent to which the general two-d...
BCYCLIC: A parallel block tridiagonal matrix cyclic solver
A block tridiagonal matrix is factored with minimal fill-in using a cyclic reduction algorithm that is easily parallelized. Storage of the factored blocks allows the application of the inverse to multiple right-hand sides which may not be known at factorization time. Scalability with the number of block rows is achieved with cyclic reduction, while scalability with the block size is achieved us...
Multi-phase array redistribution: modeling and evaluation
[Table 1: Execution times (ms) for cyclic(s) to cyclic(t) redistribution on 32 processors. Columns: s, t, lcm, lcm*2, lcm*4, gcd, gcd/2, gcd/4.] ... other block sizes t. Fig. 3 shows the total times in milliseconds for a cyclic(192) to cyclic(8) redistribution on 32 processors for increasing data sizes. This redistribution corresponds to the cyclic(Y t) to cyclic(t) case with Y = 2...